Implement compaction support in robustness test #17833

Merged
merged 1 commit into from
Jun 7, 2024

serathius
Member

No description provided.

@k8s-ci-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@serathius serathius force-pushed the robustness-compact branch 6 times, most recently from f87ed59 to 7ae772b Compare April 26, 2024 09:55
@serathius serathius force-pushed the robustness-compact branch 2 times, most recently from 713985e to 46e255e Compare May 8, 2024 10:03
@serathius
Member Author

Need to fix one more case: allow gofailpoints that are triggered by compaction to also be triggered by traffic.

@serathius
Member Author

/retest

@serathius
Member Author

Yay, the revision returned by compaction is not linearizable.

Screenshot from 2024-05-21 21-00-38

https://github.com/etcd-io/etcd/actions/runs/9179653925?pr=17833

cc @MadhavJivrajani @fuweid @siyuanfoundation

@siyuanfoundation
Contributor

siyuanfoundation commented May 21, 2024

https://github.com/etcd-io/etcd/actions/runs/9179653925?pr=17833

I don't understand why the line is drawn at the end of the compact op in the TestRobustnessExploratory_Etcd_HighTraffic_ClusterOfSize3 report.
There is a valid path from Compact(314), rev: 318 -> Compact(297), rev: 319 -> put("key0", "308"), rev: 319 -> ...
It seems Compact(297), rev: 319 should have returned ErrCompacted but didn't?
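
To make the expectation concrete, here is a minimal sketch of a toy store in which the second compaction would be rejected. The `store` type, its fields, and the error values are hypothetical names for illustration only, not the actual etcd implementation; the sketch only assumes that compacting at or below the already compacted revision fails.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values standing in for the two failure modes of a compaction.
var (
	errCompacted = errors.New("required revision has been compacted")
	errFutureRev = errors.New("required revision is a future revision")
)

// store is a toy model that tracks only the current and the compacted revision.
type store struct {
	currentRev int64
	compactRev int64
}

// compact applies a compaction request and returns the revision that the
// model would report in the response, together with an error if rejected.
func (s *store) compact(rev int64) (int64, error) {
	if rev <= s.compactRev {
		return s.currentRev, errCompacted
	}
	if rev > s.currentRev {
		return s.currentRev, errFutureRev
	}
	s.compactRev = rev
	return s.currentRev, nil
}

func main() {
	s := &store{currentRev: 318}

	_, err := s.compact(314)
	fmt.Println("Compact(314):", err)

	// Assume one write lands between the two compactions.
	s.currentRev = 319

	_, err = s.compact(297)
	fmt.Println("Compact(297):", err)
}
```

In this model, a successful Compact(297) after a successful Compact(314) cannot happen, which is why the pair looks suspicious in the report.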

Screenshot 2024-05-21 at 2 45 06 PM

@serathius
Member Author

Assuming that compaction returns a linearizable revision (which is not necessarily a property we want to provide), the operations Compact(314), rev: 318 and Compact(297), rev: 319 should not coexist. The order in which operations execute should follow the returned revisions, so it should not be possible for Compact(314) and then Compact(297) to both succeed in sequence, because it doesn't make sense to compact revision 297 after 314.
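
The check described here can be sketched as follows. This is only an illustration under the assumption that the response revision is linearizable; `compactResult` and `firstViolation` are hypothetical names, not part of the robustness test code.

```go
package main

import (
	"fmt"
	"sort"
)

// compactResult is a hypothetical record of one successful compaction.
type compactResult struct {
	compactRev  int64 // revision passed to Compact
	responseRev int64 // revision returned in the response
}

// firstViolation orders results by response revision and returns the first
// adjacent pair in which the later compaction targets a revision that is not
// higher than the one compacted before it.
func firstViolation(results []compactResult) (prev, next compactResult, found bool) {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]compactResult(nil), results...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].responseRev < sorted[j].responseRev
	})
	for i := 1; i < len(sorted); i++ {
		if sorted[i].compactRev <= sorted[i-1].compactRev {
			return sorted[i-1], sorted[i], true
		}
	}
	return compactResult{}, compactResult{}, false
}

func main() {
	history := []compactResult{
		{compactRev: 314, responseRev: 318},
		{compactRev: 297, responseRev: 319},
	}
	if prev, next, found := firstViolation(history); found {
		fmt.Printf("Compact(%d), rev: %d cannot succeed after Compact(%d), rev: %d\n",
			next.compactRev, next.responseRev, prev.compactRev, prev.responseRev)
	}
}
```

Results that share the same response revision keep their input order, so the sketch makes no claim about how such ties should be resolved.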

As mentioned above, it's not necessarily a bug, but it is at least undocumented behavior. It stems from the fact that etcd puts the revision in the same response field regardless of its meaning. Depending on the request, the revision can be interpreted in different ways. Some examples:

  • For PUT/TXN operations, the response revision is the revision of the operation. Linearizable.
  • For a Read request with revision=0, the response revision is the revision of the read. Linearizable.
  • For a Read request with revision!=0, the response revision is the latest revision in the cluster. Linearizable.
  • For a Watch request, the revision in a response with events is the revision of the local member; for bookmarks it is the revision of the watch stream. Non-linearizable.
  • For other requests I expect two cases:
    • The request goes through raft, so the revision is the cluster revision. Linearizable. I expected compaction to be here, but it might not be.
    • The request doesn't require raft, so the revision is local. Non-linearizable.
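
The list above could be encoded roughly like this, for example to decide whether a validator should compare a response revision against the linearized history. All names here are hypothetical, and the sketch only restates the expectations from the list; in particular it says nothing about where compaction actually belongs.

```go
package main

import "fmt"

// requestKind is a hypothetical classification of requests by how their
// response revision should be interpreted.
type requestKind int

const (
	kindPutOrTxn      requestKind = iota // revision of the operation
	kindRead                             // read with revision=0 or revision!=0
	kindWatchEvent                       // revision of the local member
	kindWatchBookmark                    // revision of the watch stream
	kindOtherViaRaft                     // cluster revision
	kindOtherLocal                       // local revision
)

// revisionIsLinearizable reports whether the response revision for the given
// kind is expected to be linearizable according to the list above.
func revisionIsLinearizable(k requestKind) bool {
	switch k {
	case kindPutOrTxn, kindRead, kindOtherViaRaft:
		return true
	default:
		return false
	}
}

func main() {
	kinds := []struct {
		name string
		kind requestKind
	}{
		{"put/txn", kindPutOrTxn},
		{"read", kindRead},
		{"watch event", kindWatchEvent},
		{"watch bookmark", kindWatchBookmark},
		{"other via raft", kindOtherViaRaft},
		{"other local", kindOtherLocal},
	}
	for _, k := range kinds {
		fmt.Printf("%-15s linearizable revision: %v\n", k.name, revisionIsLinearizable(k.kind))
	}
}
```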

@serathius
Member Author

Here is an even clearer example of this:
image

@serathius serathius force-pushed the robustness-compact branch 2 times, most recently from b1032f6 to e87f65b Compare May 28, 2024 20:38
@serathius serathius force-pushed the robustness-compact branch 2 times, most recently from 2974288 to 1512732 Compare May 29, 2024 18:00
@serathius
Member Author

/retest

@serathius
Member Author

/retest

@serathius
Member Author

/retest

@serathius serathius force-pushed the robustness-compact branch 2 times, most recently from 0079120 to f417c44 Compare May 30, 2024 08:58
@k8s-ci-robot k8s-ci-robot added the github_actions Pull requests that update GitHub Actions code label May 30, 2024
@serathius serathius force-pushed the robustness-compact branch 3 times, most recently from b73a516 to 4ac50b1 Compare June 1, 2024 08:28
@serathius serathius marked this pull request as ready for review June 1, 2024 08:53
@serathius
Member Author

Ready for review, finally.
cc @siyuanfoundation @fuweid @MadhavJivrajani

-func (t triggerCompact) Available(e2e.EtcdProcessClusterConfig, e2e.EtcdProcess) bool {
+func (t triggerCompact) Available(config e2e.EtcdProcessClusterConfig, _ e2e.EtcdProcess) bool {
+	// Since introduction of compaction into traffic, injecting compaction failpoints started interfering with peer proxy.
+	// TODO: Re-enable the peer proxy for compact failpoints when we confirm the root cause.
Contributor

Is there an issue tracking this?

Member Author

@serathius serathius Jun 3, 2024

Not yet, I think #16788 might be related.

I was also hoping that this might be resolved automatically by #17938, as a major rewrite might change this.

@serathius
Member Author

serathius commented Jun 6, 2024

ping @fuweid @MadhavJivrajani

@serathius
Member Author

/retest

@serathius serathius merged commit b38d863 into etcd-io:main Jun 7, 2024
47 checks passed
Labels
area/robustness-testing area/testing github_actions Pull requests that update GitHub Actions code
4 participants